Prioritizing candidate disease genes by network-based boosting of genome-wide association data.

Authors

  • Insuk Lee
  • U Martin Blom
  • Peggy I Wang
  • Jung Eun Shim
  • Edward M Marcotte
Abstract

Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
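The idea of propagating labels from uncertain seed genes can be illustrated with a generic random walk with restart, a close relative of PageRank. The sketch below is not the authors' implementation and does not use HumanNet: the function name `propagate_scores`, the toy genes `A` to `F`, the edge weights, and the seed confidences are all hypothetical. It only shows how a seed set weighted by confidence, rather than a fixed list of known disease genes, can drive the ranking.

```python
def propagate_scores(edges, seed_weights, restart=0.3, iterations=100):
    """Random walk with restart over a weighted, undirected gene network.

    edges        -- iterable of (gene_a, gene_b, weight) tuples
    seed_weights -- dict mapping seed genes to a confidence (hypothetical
                    stand-in for the strength of GWAS evidence)
    """
    # Build a symmetric adjacency map: gene -> {neighbor: edge weight}.
    neighbors = {}
    for gene_a, gene_b, weight in edges:
        neighbors.setdefault(gene_a, {})[gene_b] = weight
        neighbors.setdefault(gene_b, {})[gene_a] = weight
    for gene in seed_weights:
        neighbors.setdefault(gene, {})

    # Normalize seed confidences into a restart distribution, so a weakly
    # supported seed contributes less than a strongly supported one.
    total = sum(seed_weights.values())
    restart_dist = {g: seed_weights.get(g, 0.0) / total for g in neighbors}

    scores = dict(restart_dist)
    for _ in range(iterations):
        updated = {g: restart * restart_dist[g] for g in neighbors}
        for gene, linked in neighbors.items():
            degree = sum(linked.values())
            if degree == 0:
                # An isolated gene hands its score back to the seeds.
                for g in neighbors:
                    updated[g] += (1 - restart) * scores[gene] * restart_dist[g]
                continue
            for other, weight in linked.items():
                updated[other] += (1 - restart) * scores[gene] * weight / degree
        scores = updated
    return scores


# Toy network (made up): a chain A-B-C-D and a separate pair E-F.
edges = [("A", "B", 1.0), ("B", "C", 1.0), ("C", "D", 1.0), ("E", "F", 1.0)]
# Seed confidences (made up): A is strongly implicated, E only weakly.
seeds = {"A": 0.9, "E": 0.1}

scores = propagate_scores(edges, seeds)
for gene, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(gene, round(score, 4))
```

In this toy setting, genes close to the strongly supported seed rank above genes that neighbor only the weakly supported seed, which is the qualitative behavior the abstract describes for seeds of varying certainty.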


Related resources

GenePANDA—a novel network-based gene prioritizing tool for complex diseases

Here we describe GenePANDA, a novel network-based tool for prioritizing candidate disease genes. GenePANDA assesses whether a gene is likely a candidate disease gene based on its relative distance to known disease genes in a functional association network. A unique feature of GenePANDA is the introduction of adjusted network distance derived by normalizing the raw network distance between two g...


GWAB: a web server for the network-based boosting of human genome-wide association data

During the last decade, genome-wide association studies (GWAS) have represented a major approach to dissect complex human genetic diseases. Due in part to limited statistical power, most studies identify only small numbers of candidate genes that pass the conventional significance thresholds (e.g. P ≤ 5 × 10⁻⁸). This limitation can be partly overcome by increasing the sample size, but this come...


Prioritizing candidate disease genes by network-based boosting of genome-wide association data

Supplemental material: http://genome.cshlp.org/content/suppl/2011/04/28/gr.118992.110.DC1.html. Published online May 2, 2011, in advance of the print journal. Accepted preprint: peer-reviewed and accepted for publication but not copyedited or typeset; likely to differ from the final, published version...


eResponseNet: a package prioritizing candidate disease genes through cellular pathways

MOTIVATION Although genome-wide association studies (GWAS) have found many common genetic variants associated with human diseases, it remains a challenge to elucidate the functional links between associated variants and complex traits. RESULTS We developed a package called eResponseNet by implementing and extending the existing ResponseNet algorithm for prioritizing candidate disease genes th...


Genome-wide haplotype association study identify TNFRSF1A, CASP7, LRP1B, CDH1 and TG genes associated with Alzheimer's disease in Caribbean Hispanic individuals

Alzheimer's disease (AD) is an acquired disorder of cognitive and behavioral impairment. It is considered to be caused by variety of factors, such as age, environment and genetic factors. In order to identify the genetic affect factors of AD, we carried out a bioinformatic approach which combined genome-wide haplotype-based association study with gene prioritization. The raw SNP genotypes data ...



Journal title:
  • Genome Research

Volume 21, Issue 7

Pages: -

Publication date: 2011